Overview

Dataset statistics

Number of variables: 26
Number of observations: 87020
Missing cells: 260465
Missing cells (%): 11.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 78.5 MiB
Average record size in memory: 946.4 B
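
As a rough cross-check, the missing-cell percentage and the average record size can be derived from the other dataset statistics. The sketch below assumes that 1 MiB is 1024 × 1024 bytes and works from the rounded 78.5 MiB figure, so the record size only agrees approximately with the reported 946.4 B.

```python
# Values copied from the dataset statistics.
n_variables = 26
n_observations = 87020
missing_cells = 260465
total_size_mib = 78.5  # already rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Approximate record size: {avg_record_bytes:.1f} B")
```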

Variable types

CAT: 11
NUM: 11
BOOL: 4

Warnings

City has a high cardinality: 697 distinct values [High cardinality]
DOB has a high cardinality: 11345 distinct values [High cardinality]
Lead_Creation_Date has a high cardinality: 92 distinct values [High cardinality]
Employer_Name has a high cardinality: 43567 distinct values [High cardinality]
Salary_Account has a high cardinality: 57 distinct values [High cardinality]
EMI_Loan_Submitted is highly correlated with Loan_Amount_Submitted [High correlation]
Loan_Amount_Submitted is highly correlated with EMI_Loan_Submitted [High correlation]
City has 1003 (1.2%) missing values [Missing]
Salary_Account has 11764 (13.5%) missing values [Missing]
Loan_Amount_Submitted has 34613 (39.8%) missing values [Missing]
Loan_Tenure_Submitted has 34613 (39.8%) missing values [Missing]
Interest_Rate has 59294 (68.1%) missing values [Missing]
Processing_Fee has 59600 (68.5%) missing values [Missing]
EMI_Loan_Submitted has 59294 (68.1%) missing values [Missing]
Monthly_Income is highly skewed (γ1 = 167.5605262) [Skewed]
Existing_EMI is highly skewed (γ1 = 211.7693511) [Skewed]
ID has unique values [Unique]
Loan_Amount_Applied has 28853 (33.2%) zeros [Zeros]
Loan_Tenure_Applied has 33844 (38.9%) zeros [Zeros]
Existing_EMI has 58238 (66.9%) zeros [Zeros]
Var5 has 29087 (33.4%) zeros [Zeros]
Var4 has 2546 (2.9%) zeros [Zeros]
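
The Skewed warnings report a skewness coefficient γ1. As a minimal sketch of what that quantity measures, the function below computes the moment-based skewness m3 / m2^1.5 without any bias correction. The name `sample_skewness` and the toy data are illustrative assumptions, and the report's own estimator may apply a small-sample correction, so this is not expected to reproduce the reported values digit for digit.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Toy data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [10, 12, 11, 13, 12, 1000]  # one extreme value on the right

print(sample_skewness(symmetric))     # close to zero
print(sample_skewness(right_tailed))  # clearly positive
```

A single extreme value is enough to push the coefficient well above zero, which is consistent with the very large maxima reported for Monthly_Income and Existing_EMI.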

Reproduction

Analysis started: 2020-09-25 13:25:50.047968
Analysis finished: 2020-09-25 13:26:11.343439
Duration: 21.3 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

ID
Categorical

UNIQUE

Distinct: 87020
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 KiB

ID102986A10: 1
ID050808E30: 1
ID122562Y20: 1
ID059001H10: 1
ID067100U00: 1
Other values (87015): 87015

Value                 Count   Frequency (%)
ID102986A10           1       < 0.1%
ID050808E30           1       < 0.1%
ID122562Y20           1       < 0.1%
ID059001H10           1       < 0.1%
ID067100U00           1       < 0.1%
ID056239B40           1       < 0.1%
ID016151F10           1       < 0.1%
ID096756K10           1       < 0.1%
ID078931V10           1       < 0.1%
ID086086A10           1       < 0.1%
ID025475V00           1       < 0.1%
ID104727Z20           1       < 0.1%
ID055224A40           1       < 0.1%
ID029214Q40           1       < 0.1%
ID117978Q30           1       < 0.1%
ID117695T00           1       < 0.1%
ID073500Y00           1       < 0.1%
ID039113J30           1       < 0.1%
ID034692I20           1       < 0.1%
ID051416O10           1       < 0.1%
ID090308K30           1       < 0.1%
ID019197J20           1       < 0.1%
ID017587L20           1       < 0.1%
ID080208Y30           1       < 0.1%
ID039895L00           1       < 0.1%
Other values (86995)  86995   > 99.9%
 
Frequencies of value counts

Unique

Unique: 87020
Unique (%): 100.0%
Histogram of lengths of the category

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Overview of Unicode Properties

Unique unicode characters: 36
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
022327623.3%
 
I904059.4%
 
D903809.4%
 
1839318.8%
 
2626126.5%
 
3596276.2%
 
4595866.2%
 
8415544.3%
 
6415474.3%
 
5414404.3%
 
9413724.3%
 
7412154.3%
 
V34170.4%
 
J33870.4%
 
Y33790.4%
 
T33740.4%
 
S33670.4%
 
E33660.4%
 
B33550.4%
 
C33500.3%
 
W33470.3%
 
N33460.3%
 
K33450.3%
 
G33420.3%
 
U33390.3%
 
Other values (11)365613.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number69616072.7%
 
Uppercase Letter26106027.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
I9040534.6%
 
D9038034.6%
 
V34171.3%
 
J33871.3%
 
Y33791.3%
 
T33741.3%
 
S33671.3%
 
E33661.3%
 
B33551.3%
 
C33501.3%
 
W33471.3%
 
N33461.3%
 
K33451.3%
 
G33421.3%
 
U33391.3%
 
A33391.3%
 
F33371.3%
 
P33371.3%
 
Q33351.3%
 
O33341.3%
 
H33301.3%
 
R33231.3%
 
M33191.3%
 
L33071.3%
 
X33011.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
022327632.1%
 
18393112.1%
 
2626129.0%
 
3596278.6%
 
4595868.6%
 
8415546.0%
 
6415476.0%
 
5414406.0%
 
9413725.9%
 
7412155.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Common69616072.7%
 
Latin26106027.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
I9040534.6%
 
D9038034.6%
 
V34171.3%
 
J33871.3%
 
Y33791.3%
 
T33741.3%
 
S33671.3%
 
E33661.3%
 
B33551.3%
 
C33501.3%
 
W33471.3%
 
N33461.3%
 
K33451.3%
 
G33421.3%
 
U33391.3%
 
A33391.3%
 
F33371.3%
 
P33371.3%
 
Q33351.3%
 
O33341.3%
 
H33301.3%
 
R33231.3%
 
M33191.3%
 
L33071.3%
 
X33011.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
022327632.1%
 
18393112.1%
 
2626129.0%
 
3596278.6%
 
4595868.6%
 
8415546.0%
 
6415476.0%
 
5414406.0%
 
9413725.9%
 
7412155.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII957220100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
022327623.3%
 
I904059.4%
 
D903809.4%
 
1839318.8%
 
2626126.5%
 
3596276.2%
 
4595866.2%
 
8415544.3%
 
6415474.3%
 
5414404.3%
 
9413724.3%
 
7412154.3%
 
V34170.4%
 
J33870.4%
 
Y33790.4%
 
T33740.4%
 
S33670.4%
 
E33660.4%
 
B33550.4%
 
C33500.3%
 
W33470.3%
 
N33460.3%
 
K33450.3%
 
G33420.3%
 
U33390.3%
 
Other values (11)365613.8%
 

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 KiB

Male: 49848
Female: 37172

Value    Count   Frequency (%)
Male     49848   57.3%
Female   37172   42.7%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 6
Median length: 4
Mean length: 4.854332337
Min length: 4
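
The mean length can be recovered from the two category counts, which is a quick way to confirm that the figures in this section are mutually consistent. The sketch below is only a cross-check built from the counts reported for Male and Female; the resulting character total also matches the 422424 characters reported in the script and block tables for this variable.

```python
# Category counts copied from the Gender value table.
counts = {"Male": 49848, "Female": 37172}

# Each row contributes len(value) characters.
total_chars = sum(len(value) * count for value, count in counts.items())
total_rows = sum(counts.values())
mean_length = total_chars / total_rows

print(f"Total characters: {total_chars}")
print(f"Mean length: {mean_length:.9f}")
```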

Overview of Unicode Properties

Unique unicode characters: 6
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e12419229.4%
 
a8702020.6%
 
l8702020.6%
 
M4984811.8%
 
F371728.8%
 
m371728.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter33540479.4%
 
Uppercase Letter8702020.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M4984857.3%
 
F3717242.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e12419237.0%
 
a8702025.9%
 
l8702025.9%
 
m3717211.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin422424100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e12419229.4%
 
a8702020.6%
 
l8702020.6%
 
M4984811.8%
 
F371728.8%
 
m371728.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII422424100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e12419229.4%
 
a8702020.6%
 
l8702020.6%
 
M4984811.8%
 
F371728.8%
 
m371728.8%
 

City
Categorical

HIGH CARDINALITY
MISSING

Distinct: 697
Distinct (%): 0.8%
Missing: 1003
Missing (%): 1.2%
Memory size: 680.0 KiB

Delhi: 12527
Bengaluru: 10824
Mumbai: 10795
Hyderabad: 7272
Chennai: 6916
Other values (692): 37683

Value                Count   Frequency (%)
Delhi                12527   14.4%
Bengaluru            10824   12.4%
Mumbai               10795   12.4%
Hyderabad            7272    8.4%
Chennai              6916    7.9%
Pune                 5207    6.0%
Kolkata              2888    3.3%
Ahmedabad            1788    2.1%
Jaipur               1331    1.5%
Gurgaon              1212    1.4%
Coimbatore           1147    1.3%
Thane                905     1.0%
Chandigarh           870     1.0%
Surat                802     0.9%
Visakhapatnam        764     0.9%
Indore               734     0.8%
Vadodara             624     0.7%
Nagpur               594     0.7%
Lucknow              580     0.7%
Ghaziabad            560     0.6%
Bhopal               513     0.6%
Kochi                492     0.6%
Patna                461     0.5%
Faridabad            447     0.5%
Madurai              375     0.4%
Other values (672)   15389   17.7%
(Missing)            1003    1.2%
 
Frequencies of value counts

Unique

Unique: 79
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 24
Median length: 7
Mean length: 7.098724431
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 56
Unique unicode categories: 8
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a9600115.5%
 
e512778.3%
 
u506918.2%
 
n439017.1%
 
i437107.1%
 
r377346.1%
 
h328155.3%
 
l314135.1%
 
d285544.6%
 
b236423.8%
 
m171432.8%
 
g168172.7%
 
o136202.2%
 
D133562.2%
 
B130502.1%
 
M122902.0%
 
t94121.5%
 
C92391.5%
 
y83101.3%
 
H79611.3%
 
P64981.1%
 
p63841.0%
 
k58400.9%
 
K48100.8%
 
A33590.5%
 
Other values (31)299044.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter52718785.3%
 
Uppercase Letter8831414.3%
 
Space Separator18180.3%
 
Decimal Number3820.1%
 
Other Punctuation17< 0.1%
 
Dash Punctuation11< 0.1%
 
Open Punctuation1< 0.1%
 
Close Punctuation1< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
D1335615.1%
 
B1305014.8%
 
M1229013.9%
 
C923910.5%
 
H79619.0%
 
P64987.4%
 
K48105.4%
 
A33593.8%
 
G30343.4%
 
N24132.7%
 
J21432.4%
 
V21202.4%
 
S19542.2%
 
T17402.0%
 
R11081.3%
 
L9101.0%
 
I8310.9%
 
F5160.6%
 
E3480.4%
 
U2910.3%
 
W2130.2%
 
O770.1%
 
Y530.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a9600118.2%
 
e512779.7%
 
u506919.6%
 
n439018.3%
 
i437108.3%
 
r377347.2%
 
h328156.2%
 
l314136.0%
 
d285545.4%
 
b236424.5%
 
m171433.3%
 
g168173.2%
 
o136202.6%
 
t94121.8%
 
y83101.6%
 
p63841.2%
 
k58401.1%
 
s32800.6%
 
w20980.4%
 
c20060.4%
 
j9390.2%
 
z8230.2%
 
v6840.1%
 
f90< 0.1%
 
x3< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1818100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
219150.0%
 
419150.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&1694.1%
 
.15.9%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-11100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(1100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)1100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin61550199.6%
 
Common22300.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a9600115.6%
 
e512778.3%
 
u506918.2%
 
n439017.1%
 
i437107.1%
 
r377346.1%
 
h328155.3%
 
l314135.1%
 
d285544.6%
 
b236423.8%
 
m171432.8%
 
g168172.7%
 
o136202.2%
 
D133562.2%
 
B130502.1%
 
M122902.0%
 
t94121.5%
 
C92391.5%
 
y83101.4%
 
H79611.3%
 
P64981.1%
 
p63841.0%
 
k58400.9%
 
K48100.8%
 
A33590.5%
 
Other values (23)276744.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
181881.5%
 
21918.6%
 
41918.6%
 
&160.7%
 
-110.5%
 
(1< 0.1%
 
.1< 0.1%
 
)1< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII617731100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a9600115.5%
 
e512778.3%
 
u506918.2%
 
n439017.1%
 
i437107.1%
 
r377346.1%
 
h328155.3%
 
l314135.1%
 
d285544.6%
 
b236423.8%
 
m171432.8%
 
g168172.7%
 
o136202.2%
 
D133562.2%
 
B130502.1%
 
M122902.0%
 
t94121.5%
 
C92391.5%
 
y83101.3%
 
H79611.3%
 
P64981.1%
 
p63841.0%
 
k58400.9%
 
K48100.8%
 
A33590.5%
 
Other values (31)299044.8%
 

Monthly_Income
Real number (ℝ≥0)

SKEWED

Distinct: 5825
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58849.97435
Minimum: 0
Maximum: 444554443
Zeros: 314
Zeros (%): 0.4%
Memory size: 680.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10000
Q1: 16500
Median: 25000
Q3: 40000
95-th percentile: 95000
Maximum: 444554443
Range: 444554443
Interquartile range (IQR): 23500

Descriptive statistics

Standard deviation: 2177511.361
Coefficient of variation (CV): 37.0010588
Kurtosis: 31361.57429
Mean: 58849.97435
Median Absolute Deviation (MAD): 10000
Skewness: 167.5605262
Sum: 5121124768
Variance: 4.741555729e+12
Monotonicity: Not monotonic
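
Several of the Monthly_Income statistics are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded values printed in the report and the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, sum = mean × count), so agreement is approximate rather than exact.

```python
# Values copied from the Monthly_Income statistics (rounded in the report).
mean = 58849.97435
std = 2177511.361
q1, q3 = 16500, 40000
n = 87020  # Monthly_Income has no missing values

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation
iqr = q3 - q1         # interquartile range
total = mean * n      # sum of all values

print(f"CV: {cv:.7f}")
print(f"Variance: {variance:.9e}")
print(f"IQR: {iqr}")
print(f"Sum: {total:.0f}")
```

A CV of about 37 means the standard deviation is 37 times the mean, which is driven by a handful of extreme incomes rather than by the bulk of the distribution (the IQR is only 23500).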
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
25000                 5823    6.7%
20000                 4523    5.2%
15000                 4246    4.9%
30000                 3216    3.7%
50000                 2392    2.7%
18000                 2140    2.5%
10000                 2136    2.5%
12000                 1895    2.2%
40000                 1856    2.1%
35000                 1798    2.1%
22000                 1608    1.8%
16000                 1529    1.8%
17000                 1336    1.5%
23000                 1146    1.3%
21000                 1113    1.3%
45000                 1034    1.2%
14000                 1017    1.2%
13000                 975     1.1%
32000                 971     1.1%
28000                 969     1.1%
100000                968     1.1%
60000                 965     1.1%
24000                 904     1.0%
27000                 874     1.0%
26000                 809     0.9%
Other values (5800)   40777   46.9%

Value   Count   Frequency (%)
0       314     0.4%
1       7       < 0.1%
2       1       < 0.1%
10      5       < 0.1%
11      1       < 0.1%
12      3       < 0.1%
13      3       < 0.1%
14      5       < 0.1%
15      4       < 0.1%
16      1       < 0.1%

Value       Count   Frequency (%)
444554443   1       < 0.1%
383838383   1       < 0.1%
120100132   1       < 0.1%
100000000   4       < 0.1%
54954545    1       < 0.1%
50000000    1       < 0.1%
40000785    1       < 0.1%
34000000    1       < 0.1%
26262626    1       < 0.1%
20000000    5       < 0.1%
 

DOB
Categorical

HIGH CARDINALITY

Distinct: 11345
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 KiB

11-Nov-80: 306
02-Jan-70: 226
01-Jan-70: 148
01-Jan-90: 131
01-Jan-80: 111
Other values (11340): 86098

Value                  Count   Frequency (%)
11-Nov-80              306     0.4%
02-Jan-70              226     0.3%
01-Jan-70              148     0.2%
01-Jan-90              131     0.2%
01-Jan-80              111     0.1%
01-Jan-86              99      0.1%
01-Jan-89              97      0.1%
01-Jan-85              95      0.1%
01-Jan-88              92      0.1%
01-Jun-85              78      0.1%
01-Jun-86              76      0.1%
01-Jan-91              75      0.1%
01-Jan-87              75      0.1%
11-Nov-88              71      0.1%
01-Jan-84              65      0.1%
01-Jul-86              63      0.1%
01-Jul-89              61      0.1%
01-Jun-88              61      0.1%
05-Jun-89              58      0.1%
10-Jun-86              57      0.1%
01-Jun-87              57      0.1%
01-Jun-90              56      0.1%
01-Jun-84              56      0.1%
01-Jul-87              55      0.1%
01-Jun-89              55      0.1%
Other values (11320)   84696   97.3%
 
Frequencies of value counts

Unique

Unique: 2499
Unique (%): 2.9%
Histogram of lengths of the category

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Overview of Unicode Properties

Unique unicode characters: 33
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
-17404022.2%
 
8698848.9%
 
1511716.5%
 
0507246.5%
 
2418235.3%
 
9338084.3%
 
7276803.5%
 
J270663.5%
 
u263803.4%
 
a234573.0%
 
5198582.5%
 
6192562.5%
 
3188712.4%
 
n179272.3%
 
e178692.3%
 
M153642.0%
 
4150051.9%
 
A142131.8%
 
r132631.7%
 
c125221.6%
 
p123641.6%
 
l91391.2%
 
y89071.1%
 
g74070.9%
 
O63060.8%
 
Other values (8)488766.2%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number34808044.4%
 
Dash Punctuation17404022.2%
 
Lowercase Letter17404022.2%
 
Uppercase Letter8702011.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
86988420.1%
 
15117114.7%
 
05072414.6%
 
24182312.0%
 
9338089.7%
 
7276808.0%
 
5198585.7%
 
6192565.5%
 
3188715.4%
 
4150054.3%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-174040100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J2706631.1%
 
M1536417.7%
 
A1421316.3%
 
O63067.2%
 
D62167.1%
 
N62027.1%
 
F60957.0%
 
S55586.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
u2638015.2%
 
a2345713.5%
 
n1792710.3%
 
e1786910.3%
 
r132637.6%
 
c125227.2%
 
p123647.1%
 
l91395.3%
 
y89075.1%
 
g74074.3%
 
t63063.6%
 
o62023.6%
 
v62023.6%
 
b60953.5%
 

Most occurring scripts

ValueCountFrequency (%) 
Common52212066.7%
 
Latin26106033.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
-17404033.3%
 
86988413.4%
 
1511719.8%
 
0507249.7%
 
2418238.0%
 
9338086.5%
 
7276805.3%
 
5198583.8%
 
6192563.7%
 
3188713.6%
 
4150052.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
J2706610.4%
 
u2638010.1%
 
a234579.0%
 
n179276.9%
 
e178696.8%
 
M153645.9%
 
A142135.4%
 
r132635.1%
 
c125224.8%
 
p123644.7%
 
l91393.5%
 
y89073.4%
 
g74072.8%
 
O63062.4%
 
t63062.4%
 
D62162.4%
 
N62022.4%
 
o62022.4%
 
v62022.4%
 
F60952.3%
 
b60952.3%
 
S55582.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII783180100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
-17404022.2%
 
8698848.9%
 
1511716.5%
 
0507246.5%
 
2418235.3%
 
9338084.3%
 
7276803.5%
 
J270663.5%
 
u263803.4%
 
a234573.0%
 
5198582.5%
 
6192562.5%
 
3188712.4%
 
n179272.3%
 
e178692.3%
 
M153642.0%
 
4150051.9%
 
A142131.8%
 
r132631.7%
 
c125221.6%
 
p123641.6%
 
l91391.2%
 
y89071.1%
 
g74070.9%
 
O63060.8%
 
Other values (8)488766.2%
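
DOB is profiled as a high-cardinality categorical because its values are strings such as 11-Nov-80. A minimal sketch of parsing them as dates follows. Python's `%y` directive maps two-digit years 00 to 68 onto 2000 to 2068, so a birth year like 60 would otherwise land in the future. The helper name `parse_dob` and the 2015-12-31 cut-off are assumptions for illustration (the cut-off is chosen because the lead creation dates in this report fall in 2015), and `%b` expects English month abbreviations under the default C locale.

```python
from datetime import date, datetime


def parse_dob(text, latest=date(2015, 12, 31)):
    """Parse 'DD-Mon-YY'; shift dates that land after `latest` back 100 years.

    `latest` is an assumed cut-off: a date of birth cannot be later than
    the date on which the lead was created.
    """
    parsed = datetime.strptime(text, "%d-%b-%y").date()
    if parsed > latest:
        # %y resolved the year into the 2000s; move it to the 1900s.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


print(parse_dob("11-Nov-80"))
print(parse_dob("01-Jan-60"))
```

Once parsed, the column can be summarised as a date (or converted to an age) instead of being treated as 11345 unrelated categories.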
 

Lead_Creation_Date
Categorical

HIGH CARDINALITY

Distinct: 92
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 KiB

03-Jul-15: 2315
23-Jul-15: 1994
30-Jul-15: 1297
27-Jul-15: 1292
31-Jul-15: 1268
Other values (87): 78854

Value               Count   Frequency (%)
03-Jul-15           2315    2.7%
23-Jul-15           1994    2.3%
30-Jul-15           1297    1.5%
27-Jul-15           1292    1.5%
31-Jul-15           1268    1.5%
29-Jul-15           1236    1.4%
20-Jul-15           1231    1.4%
21-Jul-15           1201    1.4%
22-Jun-15           1201    1.4%
15-Jul-15           1193    1.4%
28-Jul-15           1191    1.4%
26-May-15           1190    1.4%
18-Jul-15           1188    1.4%
22-Jul-15           1188    1.4%
23-Jun-15           1187    1.4%
17-Jun-15           1154    1.3%
04-Jun-15           1132    1.3%
05-May-15           1128    1.3%
06-Jul-15           1126    1.3%
04-May-15           1088    1.3%
29-Jun-15           1088    1.3%
13-May-15           1081    1.2%
18-May-15           1078    1.2%
03-Jun-15           1066    1.2%
27-May-15           1064    1.2%
Other values (67)   55843   64.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
-17404022.2%
 
112275815.7%
 
59542012.2%
 
J600597.7%
 
u600597.7%
 
2386884.9%
 
0339034.3%
 
l329964.2%
 
n270633.5%
 
M269613.4%
 
a269613.4%
 
y269613.4%
 
3152651.9%
 
889361.1%
 
786231.1%
 
686161.1%
 
981991.0%
 
476721.0%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number34808044.4%
 
Dash Punctuation17404022.2%
 
Lowercase Letter17404022.2%
 
Uppercase Letter8702011.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
112275835.3%
 
59542027.4%
 
23868811.1%
 
0339039.7%
 
3152654.4%
 
889362.6%
 
786232.5%
 
686162.5%
 
981992.4%
 
476722.2%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-174040100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J6005969.0%
 
M2696131.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
u6005934.5%
 
l3299619.0%
 
n2706315.5%
 
a2696115.5%
 
y2696115.5%
 

Most occurring scripts

ValueCountFrequency (%) 
Common52212066.7%
 
Latin26106033.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
-17404033.3%
 
112275823.5%
 
59542018.3%
 
2386887.4%
 
0339036.5%
 
3152652.9%
 
889361.7%
 
786231.7%
 
686161.7%
 
981991.6%
 
476721.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
J6005923.0%
 
u6005923.0%
 
l3299612.6%
 
n2706310.4%
 
M2696110.3%
 
a2696110.3%
 
y2696110.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII783180100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
-17404022.2%
 
112275815.7%
 
59542012.2%
 
J600597.7%
 
u600597.7%
 
2386884.9%
 
0339034.3%
 
l329964.2%
 
n270633.5%
 
M269613.4%
 
a269613.4%
 
y269613.4%
 
3152651.9%
 
889361.1%
 
786231.1%
 
686161.1%
 
981991.0%
 
476721.0%
 

Loan_Amount_Applied
Real number (ℝ≥0)

ZEROS

Distinct: 277
Distinct (%): 0.3%
Missing: 71
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 230250.6999
Minimum: 0
Maximum: 10000000
Zeros: 28853
Zeros (%): 33.2%
Memory size: 680.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 100000
Q3: 300000
95-th percentile: 1000000
Maximum: 10000000
Range: 10000000
Interquartile range (IQR): 300000

Descriptive statistics

Standard deviation: 354206.7595
Coefficient of variation (CV): 1.538352585
Kurtosis: 72.20964602
Mean: 230250.6999
Median Absolute Deviation (MAD): 100000
Skewness: 5.64187128
Sum: 2.002006811e+10
Variance: 1.254624285e+11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    28853   33.2%
100000               14311   16.4%
200000               13058   15.0%
300000               9995    11.5%
500000               9762    11.2%
1000000              4195    4.8%
50000                1245    1.4%
400000               546     0.6%
150000               540     0.6%
600000               391     0.4%
1500000              374     0.4%
700000               343     0.4%
800000               227     0.3%
2000000              215     0.2%
60000                207     0.2%
250000               192     0.2%
30000                158     0.2%
350000               144     0.2%
2500000              134     0.2%
70000                130     0.1%
20000                109     0.1%
1200000              105     0.1%
80000                89      0.1%
75000                89      0.1%
40000                88      0.1%
Other values (252)   1449    1.7%

Value   Count   Frequency (%)
0       28853   33.2%
2       1       < 0.1%
4       11      < 0.1%
5       1       < 0.1%
6       11      < 0.1%
7       10      < 0.1%
8       11      < 0.1%
9       3       < 0.1%
10      1       < 0.1%
12      7       < 0.1%

Value      Count   Frequency (%)
10000000   1       < 0.1%
9999999    1       < 0.1%
9000000    2       < 0.1%
8000000    2       < 0.1%
7000000    7       < 0.1%
6500000    2       < 0.1%
6000000    6       < 0.1%
5500000    2       < 0.1%
5000000    34      < 0.1%
4800000    1       < 0.1%
 

Loan_Tenure_Applied
Real number (ℝ≥0)

ZEROS

Distinct: 11
Distinct (%): < 0.1%
Missing: 71
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.131398866
Minimum: 0
Maximum: 10
Zeros: 33844
Zeros (%): 38.9%
Memory size: 680.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 4
95-th percentile: 5
Maximum: 10
Range: 10
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.014193118
Coefficient of variation (CV): 0.9450099415
Kurtosis: -1.430846994
Mean: 2.131398866
Median Absolute Deviation (MAD): 2
Skewness: 0.264624048
Sum: 185323
Variance: 4.056973915
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value       Count   Frequency (%)
0           33844   38.9%
5           19083   21.9%
3           13080   15.0%
2           9463    10.9%
4           6620    7.6%
1           4812    5.5%
10          40      < 0.1%
7           3       < 0.1%
6           2       < 0.1%
9           1       < 0.1%
8           1       < 0.1%
(Missing)   71      0.1%

Value   Count   Frequency (%)
0       33844   38.9%
1       4812    5.5%
2       9463    10.9%
3       13080   15.0%
4       6620    7.6%
5       19083   21.9%
6       2       < 0.1%
7       3       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Value   Count   Frequency (%)
10      40      < 0.1%
9       1       < 0.1%
8       1       < 0.1%
7       3       < 0.1%
6       2       < 0.1%
5       19083   21.9%
4       6620    7.6%
3       13080   15.0%
2       9463    10.9%
1       4812    5.5%
 

Existing_EMI
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 3753
Distinct (%): 4.3%
Missing: 71
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3696.227824
Minimum: 0
Maximum: 10000000
Zeros: 58238
Zeros (%): 66.9%
Memory size: 680.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 3500
95-th percentile: 18000
Maximum: 10000000
Range: 10000000
Interquartile range (IQR): 3500

Descriptive statistics

Standard deviation: 39810.21192
Coefficient of variation (CV): 10.77049733
Kurtosis: 49764.8527
Mean: 3696.227824
Median Absolute Deviation (MAD): 0
Skewness: 211.7693511
Sum: 321383313.1
Variance: 1584852973
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     58238   66.9%
5000                  2695    3.1%
10000                 1737    2.0%
3000                  1581    1.8%
4000                  1226    1.4%
2000                  1097    1.3%
6000                  837     1.0%
15000                 800     0.9%
8000                  786     0.9%
2500                  727     0.8%
7000                  718     0.8%
3500                  601     0.7%
20000                 521     0.6%
12000                 480     0.6%
9000                  361     0.4%
4500                  340     0.4%
1000                  323     0.4%
1500                  323     0.4%
25000                 282     0.3%
11000                 277     0.3%
7500                  255     0.3%
30000                 241     0.3%
14000                 217     0.2%
13000                 209     0.2%
5500                  203     0.2%
Other values (3728)   11874   13.6%

Value   Count   Frequency (%)
0       58238   66.9%
1       43      < 0.1%
1.5     1       < 0.1%
2       13      < 0.1%
3       7       < 0.1%
3.5     2       < 0.1%
4       5       < 0.1%
4.5     1       < 0.1%
5       2       < 0.1%
6       3       < 0.1%

Value      Count   Frequency (%)
10000000   1       < 0.1%
5454365    1       < 0.1%
626266     1       < 0.1%
420000     2       < 0.1%
300000     2       < 0.1%
273000     1       < 0.1%
250000     1       < 0.1%
225000     1       < 0.1%
200000     4       < 0.1%
185000     1       < 0.1%
 

Employer_Name
Categorical

HIGH CARDINALITY

Distinct: 43567
Distinct (%): 50.1%
Missing: 71
Missing (%): 0.1%
Memory size: 680.0 KiB

0: 4914
TATA CONSULTANCY SERVICES LTD (TCS): 550
COGNIZANT TECHNOLOGY SOLUTIONS INDIA PVT LTD: 404
ACCENTURE SERVICES PVT LTD: 324
GOOGLE: 301
Other values (43562): 80456

Value                                          Count   Frequency (%)
0                                              4914    5.6%
TATA CONSULTANCY SERVICES LTD (TCS)            550     0.6%
COGNIZANT TECHNOLOGY SOLUTIONS INDIA PVT LTD   404     0.5%
ACCENTURE SERVICES PVT LTD                     324     0.4%
GOOGLE                                         301     0.3%
HCL TECHNOLOGIES LTD                           250     0.3%
ICICI BANK LTD                                 239     0.3%
INDIAN AIR FORCE                               191     0.2%
INFOSYS TECHNOLOGIES                           181     0.2%
GENPACT                                        179     0.2%
IBM CORPORATION                                173     0.2%
INDIAN ARMY                                    171     0.2%
TYPE SLOWLY FOR AUTO FILL                      162     0.2%
WIPRO TECHNOLOGIES                             155     0.2%
HDFC BANK LTD                                  148     0.2%
IKYA HUMAN CAPITAL SOLUTIONS LTD               142     0.2%
STATE GOVERNMENT                               134     0.2%
INDIAN RAILWAY                                 130     0.1%
INDIAN NAVY                                    128     0.1%
ARMY                                           126     0.1%
WIPRO BPO                                      116     0.1%
OTHERS                                         115     0.1%
CONVERGYS INDIA SERVICES PVT LTD               113     0.1%
TECH MAHINDRA LTD                              113     0.1%
SERCO BPO PVT LTD                              108     0.1%
Other values (43542)                           77382   88.9%
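
Some of the most frequent Employer_Name values, such as 0, TYPE SLOWLY FOR AUTO FILL and OTHERS, look like form placeholders rather than real employers. The sketch below shows one way such entries could be mapped to missing before analysis. The placeholder set and the helper name `clean_employer` are assumptions made for illustration; whether these values really are non-answers should be confirmed against how the data was collected.

```python
# Assumed placeholder values, taken from the frequency table above.
PLACEHOLDERS = {"0", "OTHERS", "TYPE SLOWLY FOR AUTO FILL"}


def clean_employer(name):
    """Return a normalised employer name, or None for placeholder entries."""
    if name is None:
        return None
    # Collapse repeated whitespace and compare case-insensitively.
    normalized = " ".join(name.split()).upper()
    if not normalized or normalized in PLACEHOLDERS:
        return None
    return normalized


print(clean_employer("0"))
print(clean_employer("  type slowly for auto fill "))
print(clean_employer("ICICI  BANK LTD"))
```

Treating these entries as missing would raise the missing count for Employer_Name well above the 71 reported here, since the value 0 alone accounts for 4914 rows.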
 
Frequencies of value counts

Unique

Unique: 33451
Unique (%): 38.5%
Histogram of lengths of the category

Length

Max length: 103
Median length: 20
Mean length: 20.54652953
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 113
Unique unicode categories: 16
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
19938611.2%
 
T1545188.6%
 
A1480068.3%
 
I1304927.3%
 
E1276437.1%
 
N1108046.2%
 
S1044475.8%
 
L1036995.8%
 
R913655.1%
 
O887035.0%
 
D872694.9%
 
C669683.7%
 
P569403.2%
 
V462492.6%
 
H424472.4%
 
M409962.3%
 
U377762.1%
 
G314791.8%
 
B198261.1%
 
Y195841.1%
 
F178541.0%
 
K169330.9%
 
W91850.5%
 
J74260.4%
 
.65040.4%
 
Other values (88)214601.2%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter156801487.7%
 
Space Separator19939511.2%
 
Other Punctuation88740.5%
 
Decimal Number63040.4%
 
Open Punctuation16130.1%
 
Close Punctuation15990.1%
 
Lowercase Letter13980.1%
 
Dash Punctuation682< 0.1%
 
Currency Symbol36< 0.1%
 
Control14< 0.1%
 
Modifier Symbol7< 0.1%
 
Other Symbol7< 0.1%
 
Connector Punctuation5< 0.1%
 
Other Number5< 0.1%
 
Math Symbol4< 0.1%
 
Other Letter2< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T1545189.9%
 
A1480069.4%
 
I1304928.3%
 
E1276438.1%
 
N1108047.1%
 
S1044476.7%
 
L1036996.6%
 
R913655.8%
 
O887035.7%
 
D872695.6%
 
C669684.3%
 
P569403.6%
 
V462492.9%
 
H424472.7%
 
M409962.6%
 
U377762.4%
 
G314792.0%
 
B198261.3%
 
Y195841.2%
 
F178541.1%
 
K169331.1%
 
W91850.6%
 
J74260.5%
 
X40070.3%
 
Z24640.2%
 
Other values (6)9340.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
199386> 99.9%
 
 9< 0.1%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(160399.4%
 
[80.5%
 
{20.1%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)159099.4%
 
]70.4%
 
}20.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.650473.3%
 
&170319.2%
 
,4374.9%
 
/1902.1%
 
@170.2%
 
;130.1%
 
:4< 0.1%
 
?2< 0.1%
 
¿2< 0.1%
 
"1< 0.1%
 
#1< 0.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0509480.8%
 
22594.1%
 
42213.5%
 
31943.1%
 
11632.6%
 
71302.1%
 
9731.2%
 
5641.0%
 
8570.9%
 
6490.8%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-682100.0%
 

Most frequent Modifier Symbol characters

ValueCountFrequency (%) 
¸457.1%
 
¨114.3%
 
`114.3%
 
¯114.3%
 

Most frequent Control characters

ValueCountFrequency (%) 
428.6%
 
€214.3%
 
214.3%
 
Ÿ17.1%
 
‹17.1%
 
‡17.1%
 
•17.1%
 
‚17.1%
 
—17.1%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_5100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n27019.3%
 
t17912.8%
 
e16912.1%
 
a15611.2%
 
r1128.0%
 
o1037.4%
 
i584.1%
 
l473.4%
 
m473.4%
 
v453.2%
 
h443.1%
 
d423.0%
 
s292.1%
 
p151.1%
 
y141.0%
 
u130.9%
 
c130.9%
 
g120.9%
 
f80.6%
 
b80.6%
 
w40.3%
 
k40.3%
 
z20.1%
 
x20.1%
 
µ10.1%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
>375.0%
 
¬125.0%
 

Most frequent Currency Symbol characters

ValueCountFrequency (%) 
¤2877.8%
 
¥719.4%
 
$12.8%
 

Most frequent Other Number characters

ValueCountFrequency (%) 
¾480.0%
 
¹120.0%
 

Most frequent Other Symbol characters

ValueCountFrequency (%) 
°342.9%
 
®228.6%
 
¦228.6%
 

Most frequent Other Letter characters

ValueCountFrequency (%) 
ª150.0%
 
º150.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin156941387.8%
 
Common21854612.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
T1545189.8%
 
A1480069.4%
 
I1304928.3%
 
E1276438.1%
 
N1108047.1%
 
S1044476.7%
 
L1036996.6%
 
R913655.8%
 
O887035.7%
 
D872695.6%
 
C669684.3%
 
P569403.6%
 
V462492.9%
 
H424472.7%
 
M409962.6%
 
U377762.4%
 
G314792.0%
 
B198261.3%
 
Y195841.2%
 
F178541.1%
 
K169331.1%
 
W91850.6%
 
J74260.5%
 
X40070.3%
 
Z24640.2%
 
Other values (33)23330.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
19938691.2%
 
.65043.0%
 
050942.3%
 
&17030.8%
 
(16030.7%
 
)15900.7%
 
-6820.3%
 
,4370.2%
 
22590.1%
 
42210.1%
 
31940.1%
 
/1900.1%
 
11630.1%
 
71300.1%
 
973< 0.1%
 
564< 0.1%
 
857< 0.1%
 
649< 0.1%
 
¤28< 0.1%
 
@17< 0.1%
 
;13< 0.1%
 
 9< 0.1%
 
[8< 0.1%
 
]7< 0.1%
 
¥7< 0.1%
 
Other values (30)58< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1787833> 99.9%
 
None126< 0.1%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
19938611.2%
 
T1545188.6%
 
A1480068.3%
 
I1304927.3%
 
E1276437.1%
 
N1108046.2%
 
S1044475.8%
 
L1036995.8%
 
R913655.1%
 
O887035.0%
 
D872694.9%
 
C669683.7%
 
P569403.2%
 
V462492.6%
 
H424472.4%
 
M409962.3%
 
U377762.1%
 
G314791.8%
 
B198261.1%
 
Y195841.1%
 
F178541.0%
 
K169330.9%
 
W91850.5%
 
J74260.4%
 
.65040.4%
 
Other values (59)213341.2%
 

Most frequent None characters

ValueCountFrequency (%) 
À3326.2%
 
¤2822.2%
 
Â107.9%
 
 97.1%
 
¥75.6%
 
¸43.2%
 
¾43.2%
 
43.2%
 
°32.4%
 
€21.6%
 
®21.6%
 
¦21.6%
 
¿21.6%
 
É10.8%
 
¨10.8%
 
Ê10.8%
 
Ÿ10.8%
 
‹10.8%
 
‡10.8%
 
µ10.8%
 
¹10.8%
 
•10.8%
 
‚10.8%
 
¯10.8%
 
ª10.8%
 
Other values (4)43.2%
 

Salary_Account
Categorical

HIGH CARDINALITY
MISSING

Distinct: 57
Distinct (%): 0.1%
Missing: 11764
Missing (%): 13.5%
Memory size: 680.0 KiB

HDFC Bank: 17695
ICICI Bank: 13636
State Bank of India: 11843
Axis Bank: 8783
Citibank: 2376
Other values (52): 20923

Value                       Count   Frequency (%)
HDFC Bank                   17695   20.3%
ICICI Bank                  13636   15.7%
State Bank of India         11843   13.6%
Axis Bank                   8783    10.1%
Citibank                    2376    2.7%
Kotak Bank                  2067    2.4%
IDBI Bank                   1550    1.8%
Punjab National Bank        1201    1.4%
Bank of India               1170    1.3%
Bank of Baroda              1126    1.3%
Standard Chartered Bank     995     1.1%
Canara Bank                 990     1.1%
Union Bank of India         951     1.1%
Yes Bank                    779     0.9%
ING Vysya                   678     0.8%
Corporation bank            649     0.7%
Indian Overseas Bank        612     0.7%
State Bank of Hyderabad     597     0.7%
Indian Bank                 555     0.6%
Oriental Bank of Commerce   524     0.6%
IndusInd Bank               503     0.6%
Andhra Bank                 485     0.6%
Central Bank of India       445     0.5%
Syndicate Bank              415     0.5%
Bank of Maharasthra         406     0.5%
Other values (32)           4225    4.9%
(Missing)                   11764   13.5%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 47
Median length: 9
Mean length: 11.09836819
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 48
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a14215914.7%
 
n12635013.1%
 
11113211.5%
 
k769968.0%
 
B744367.7%
 
I616506.4%
 
C516455.3%
 
t384414.0%
 
i349163.6%
 
o269742.8%
 
d243392.5%
 
e226702.3%
 
D195942.0%
 
H186281.9%
 
f182601.9%
 
F179631.9%
 
S156211.6%
 
s134571.4%
 
r132891.4%
 
A96161.0%
 
x87830.9%
 
b53590.6%
 
y37570.4%
 
l33540.3%
 
h31610.3%
 
Other values (23)232302.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter57144559.2%
 
Uppercase Letter28263929.3%
 
Space Separator11113211.5%
 
Other Punctuation456< 0.1%
 
Dash Punctuation108< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B7443626.3%
 
I6165021.8%
 
C5164518.3%
 
D195946.9%
 
H186286.6%
 
F179636.4%
 
S156215.5%
 
A96163.4%
 
K26560.9%
 
N19580.7%
 
P14600.5%
 
O13750.5%
 
U13710.5%
 
V13060.5%
 
Y7790.3%
 
M7320.3%
 
G6900.2%
 
J3900.1%
 
T3810.1%
 
L3000.1%
 
R88< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
111132100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a14215924.9%
 
n12635022.1%
 
k7699613.5%
 
t384416.7%
 
i349166.1%
 
o269744.7%
 
d243394.3%
 
e226704.0%
 
f182603.2%
 
s134572.4%
 
r132892.3%
 
x87831.5%
 
b53590.9%
 
y37570.7%
 
l33540.6%
 
h31610.6%
 
u29120.5%
 
j15240.3%
 
c13860.2%
 
m12280.2%
 
p10880.2%
 
v8390.1%
 
w195< 0.1%
 
g8< 0.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&456100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-108100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin85408488.4%
 
Common11169611.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a14215916.6%
 
n12635014.8%
 
k769969.0%
 
B744368.7%
 
I616507.2%
 
C516456.0%
 
t384414.5%
 
i349164.1%
 
o269743.2%
 
d243392.8%
 
e226702.7%
 
D195942.3%
 
H186282.2%
 
f182602.1%
 
F179632.1%
 
S156211.8%
 
s134571.6%
 
r132891.6%
 
A96161.1%
 
x87831.0%
 
b53590.6%
 
y37570.4%
 
l33540.4%
 
h31610.4%
 
u29120.3%
 
Other values (20)197542.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
11113299.5%
 
&4560.4%
 
-1080.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII965780100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a14215914.7%
 
n12635013.1%
 
11113211.5%
 
k769968.0%
 
B744367.7%
 
I616506.4%
 
C516455.3%
 
t384414.0%
 
i349163.6%
 
o269742.8%
 
d243392.5%
 
e226702.3%
 
D195942.0%
 
H186281.9%
 
f182601.9%
 
F179631.9%
 
S156211.6%
 
s134571.4%
 
r132891.4%
 
A96161.0%
 
x87830.9%
 
b53590.6%
 
y37570.4%
 
l33540.3%
 
h31610.3%
 
Other values (23)232302.4%
 

Mobile_Verified
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
Y5648164.9%
 
N3053935.1%
 

Var5
Real number (ℝ≥0)

ZEROS

Distinct19
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.961503103
Minimum0
Maximum18
Zeros29087
Zeros (%)33.4%
Memory size680.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median2
Q311
95-th percentile15
Maximum18
Range18
Interquartile range (IQR)11

Descriptive statistics

Standard deviation5.670384977
Coefficient of variation (CV)1.142876435
Kurtosis-0.987668742
Mean4.961503103
Median Absolute Deviation (MAD)2
Skewness0.7606063211
Sum431750
Variance32.15326579
MonotonicityNot monotonic
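
The descriptive statistics above are linked by a few simple identities, which makes them easy to sanity-check. The sketch below is only an illustration: it copies the Var5 figures from this report and recomputes the mean, coefficient of variation, variance, range and IQR from them. The variable names are hypothetical, and nothing here claims to be how the report itself computes these values.

```python
# Values copied from the Var5 summary in this report.
n_observations = 87020
total = 431750            # "Sum"
mean = 4.961503103        # "Mean"
std = 5.670384977         # "Standard deviation"
minimum, maximum = 0, 18  # "Minimum", "Maximum"
q1, q3 = 0, 11            # "Q1", "Q3"

# Mean is the sum divided by the number of (non-missing) observations.
mean_from_sum = total / n_observations

# Coefficient of variation is the standard deviation divided by the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Range and interquartile range are plain differences.
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean_from_sum:.6f} cv={cv:.6f} variance={variance:.6f}")
print(f"range={value_range} iqr={iqr}")
```

The recomputed values agree with the reported Mean, Coefficient of variation (CV), Variance, Range and Interquartile range (IQR) up to rounding.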
Histogram with fixed size bins (bins=19)
ValueCountFrequency (%) 
02908733.4%
 
11223614.1%
 
367597.8%
 
1152046.0%
 
244855.2%
 
1436624.2%
 
1535094.0%
 
1229893.4%
 
1326223.0%
 
825152.9%
 
1024272.8%
 
922812.6%
 
1620972.4%
 
418152.1%
 
1716911.9%
 
714891.7%
 
69831.1%
 
59751.1%
 
181940.2%
 
ValueCountFrequency (%) 
02908733.4%
 
11223614.1%
 
244855.2%
 
367597.8%
 
418152.1%
 
59751.1%
 
69831.1%
 
714891.7%
 
825152.9%
 
922812.6%
 
ValueCountFrequency (%) 
181940.2%
 
1716911.9%
 
1620972.4%
 
1535094.0%
 
1436624.2%
 
1326223.0%
 
1229893.4%
 
1152046.0%
 
1024272.8%
 
922812.6%
 

Var1
Categorical

Distinct19
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
HBXX5929468.1%
 
HBXC901010.4%
 
HBXB44795.1%
 
HAXA29093.3%
 
HBXA21232.4%
 
HAXB20112.3%
 
HBXD19642.3%
 
HAXC15361.8%
 
HBXH9701.1%
 
HCXF7220.8%
 
HAYT5080.6%
 
HAVC3840.4%
 
HAXM2680.3%
 
HCXD2370.3%
 
HCYS2170.2%
 
HVYS1860.2%
 
HAZD1090.1%
 
HCXG780.1%
 
HAXF15< 0.1%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length4
Median length4
Mean length4
Min length4

Overview of Unicode Properties

Unique unicode characters14
Unique unicode categories1
Unique unicode scripts1
Unique unicode blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
X14491041.6%
 
H8799025.3%
 
B8433024.2%
 
A127723.7%
 
C121843.5%
 
D23100.7%
 
Y9110.3%
 
F7370.2%
 
V5700.2%
 
T5080.1%
 
S4030.1%
 
M2680.1%
 
Z109< 0.1%
 
G78< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter348080100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
X14491041.6%
 
H8799025.3%
 
B8433024.2%
 
A127723.7%
 
C121843.5%
 
D23100.7%
 
Y9110.3%
 
F7370.2%
 
V5700.2%
 
T5080.1%
 
S4030.1%
 
M2680.1%
 
Z109< 0.1%
 
G78< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin348080100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
X14491041.6%
 
H8799025.3%
 
B8433024.2%
 
A127723.7%
 
C121843.5%
 
D23100.7%
 
Y9110.3%
 
F7370.2%
 
V5700.2%
 
T5080.1%
 
S4030.1%
 
M2680.1%
 
Z109< 0.1%
 
G78< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII348080100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
X14491041.6%
 
H8799025.3%
 
B8433024.2%
 
A127723.7%
 
C121843.5%
 
D23100.7%
 
Y9110.3%
 
F7370.2%
 
V5700.2%
 
T5080.1%
 
S4030.1%
 
M2680.1%
 
Z109< 0.1%
 
G78< 0.1%
 

Loan_Amount_Submitted
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct203
Distinct (%)0.4%
Missing34613
Missing (%)39.8%
Infinite0
Infinite (%)0.0%
Mean395010.5902
Minimum50000
Maximum3000000
Zeros0
Zeros (%)0.0%
Memory size680.0 KiB

Quantile statistics

Minimum50000
5-th percentile100000
Q1200000
median300000
Q3500000
95-th percentile1000000
Maximum3000000
Range2950000
Interquartile range (IQR)300000

Descriptive statistics

Standard deviation308248.1363
Coefficient of variation (CV)0.7803541067
Kurtosis6.489087539
Mean395010.5902
Median Absolute Deviation (MAD)150000
Skewness2.104983545
Sum2.070132e+10
Variance9.50169135e+10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
10000068847.9%
 
20000065837.6%
 
30000053856.2%
 
50000048495.6%
 
100000016441.9%
 
40000012291.4%
 
29000010391.2%
 
3500008200.9%
 
3600008160.9%
 
4200007740.9%
 
3400007380.8%
 
1500007370.8%
 
3300007340.8%
 
4500007280.8%
 
3200006550.8%
 
15000006520.7%
 
3900006010.7%
 
2400005800.7%
 
12000005030.6%
 
2200004910.6%
 
1900004890.6%
 
2500004890.6%
 
3700004670.5%
 
3800004200.5%
 
6000004030.5%
 
Other values (178)1369715.7%
 
(Missing)3461339.8%
 
ValueCountFrequency (%) 
500003520.4%
 
600001990.2%
 
700002290.3%
 
800001840.2%
 
900001650.2%
 
10000068847.9%
 
1100001500.2%
 
1200002300.3%
 
1300002150.2%
 
1400001720.2%
 
ValueCountFrequency (%) 
30000007< 0.1%
 
28800001< 0.1%
 
26400001< 0.1%
 
25700001< 0.1%
 
2500000470.1%
 
24800001< 0.1%
 
24700001< 0.1%
 
24600001< 0.1%
 
24100001< 0.1%
 
24000001< 0.1%
 

Loan_Tenure_Submitted
Real number (ℝ≥0)

MISSING

Distinct6
Distinct (%)< 0.1%
Missing34613
Missing (%)39.8%
Infinite0
Infinite (%)0.0%
Mean3.891369474
Minimum1
Maximum6
Zeros0
Zeros (%)0.0%
Memory size680.0 KiB

Quantile statistics

Minimum1
5-th percentile2
Q13
median4
Q35
95-th percentile5
Maximum6
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.16535892
Coefficient of variation (CV)0.2994726993
Kurtosis-0.2376391343
Mean3.891369474
Median Absolute Deviation (MAD)1
Skewness-0.8433232334
Sum203935
Variance1.358061413
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%) 
52076523.9%
 
41513517.4%
 
3885810.2%
 
253326.1%
 
123142.7%
 
63< 0.1%
 
(Missing)3461339.8%
 
ValueCountFrequency (%) 
123142.7%
 
253326.1%
 
3885810.2%
 
41513517.4%
 
52076523.9%
 
63< 0.1%
 
ValueCountFrequency (%) 
63< 0.1%
 
52076523.9%
 
41513517.4%
 
3885810.2%
 
253326.1%
 
123142.7%
 

Interest_Rate
Real number (ℝ≥0)

MISSING

Distinct73
Distinct (%)0.3%
Missing59294
Missing (%)68.1%
Infinite0
Infinite (%)0.0%
Mean19.19747421
Minimum11.99
Maximum37
Zeros0
Zeros (%)0.0%
Memory size680.0 KiB

Quantile statistics

Minimum11.99
5-th percentile13.5
Q115.25
median18
Q320
95-th percentile31.5
Maximum37
Range25.01
Interquartile range (IQR)4.75

Descriptive statistics

Standard deviation5.834213258
Coefficient of variation (CV)0.3039052531
Kurtosis1.14015201
Mean19.19747421
Median Absolute Deviation (MAD)2.5
Skewness1.430301188
Sum532269.17
Variance34.03804434
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
2047075.4%
 
14.8520162.3%
 
13.9916992.0%
 
31.516961.9%
 
15.2515531.8%
 
16.7515181.7%
 
18.2513121.5%
 
15.512921.5%
 
28.59501.1%
 
18.48000.9%
 
136600.8%
 
246490.7%
 
196250.7%
 
15.755570.6%
 
13.55210.6%
 
18.155060.6%
 
35.54930.6%
 
184740.5%
 
174160.5%
 
16.253700.4%
 
17.53590.4%
 
18.53150.4%
 
373020.3%
 
14.492920.3%
 
13.492750.3%
 
Other values (48)33693.9%
 
(Missing)5929468.1%
 
ValueCountFrequency (%) 
11.99900.1%
 
12.991910.2%
 
136600.8%
 
13.25870.1%
 
13.492750.3%
 
13.55210.6%
 
13.752550.3%
 
13.9916992.0%
 
144< 0.1%
 
14.252620.3%
 
ValueCountFrequency (%) 
373020.3%
 
35.54930.6%
 
331580.2%
 
32.52120.2%
 
31.516961.9%
 
31560.1%
 
30.513< 0.1%
 
29.526< 0.1%
 
29460.1%
 
28.59501.1%
 

Processing_Fee
Real number (ℝ≥0)

MISSING

Distinct571
Distinct (%)2.1%
Missing59600
Missing (%)68.5%
Infinite0
Infinite (%)0.0%
Mean5131.150839
Minimum200
Maximum50000
Zeros0
Zeros (%)0.0%
Memory size680.0 KiB

Quantile statistics

Minimum200
5-th percentile1000
Q12000
median4000
Q36250
95-th percentile14000
Maximum50000
Range49800
Interquartile range (IQR)4250

Descriptive statistics

Standard deviation4725.837644
Coefficient of variation (CV)0.9210093003
Kurtosis10.58856672
Mean5131.150839
Median Absolute Deviation (MAD)2000
Skewness2.680108856
Sum140696156
Variance22333541.44
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
200030743.5%
 
100020672.4%
 
400020062.3%
 
300012861.5%
 
600011831.4%
 
1000010931.3%
 
15006410.7%
 
50005840.7%
 
25005520.6%
 
45004680.5%
 
38003190.4%
 
29003170.4%
 
36002960.3%
 
42002870.3%
 
44002820.3%
 
33002760.3%
 
32002670.3%
 
35002640.3%
 
16002560.3%
 
80002480.3%
 
48002410.3%
 
68002410.3%
 
24002310.3%
 
75002280.3%
 
58002230.3%
 
Other values (546)1049012.1%
 
(Missing)5960068.5%
 
ValueCountFrequency (%) 
2001< 0.1%
 
25019< 0.1%
 
3006< 0.1%
 
3251< 0.1%
 
3507< 0.1%
 
3752< 0.1%
 
40015< 0.1%
 
4503< 0.1%
 
4801< 0.1%
 
5001960.2%
 
ValueCountFrequency (%) 
500003< 0.1%
 
454001< 0.1%
 
4000015< 0.1%
 
386001< 0.1%
 
376001< 0.1%
 
375006< 0.1%
 
3600024< 0.1%
 
344001< 0.1%
 
340005< 0.1%
 
338001< 0.1%
 

EMI_Loan_Submitted
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct4530
Distinct (%)16.3%
Missing59294
Missing (%)68.1%
Infinite0
Infinite (%)0.0%
Mean10999.52838
Minimum1176.41
Maximum144748.28
Zeros0
Zeros (%)0.0%
Memory size680.0 KiB

Quantile statistics

Minimum1176.41
5-th percentile3447.1
Q16491.6
median9392.97
Q312919.04
95-th percentile25444.4125
Maximum144748.28
Range143571.87
Interquartile range (IQR)6427.44

Descriptive statistics

Standard deviation7512.32305
Coefficient of variation (CV)0.6829677412
Kurtosis16.8985671
Mean10999.52838
Median Absolute Deviation (MAD)3306.9
Skewness2.754955411
Sum304972923.8
Variance56434997.6
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3716.362880.3%
 
7948.172520.3%
 
5089.582400.3%
 
5298.782290.3%
 
8742.982180.3%
 
7432.722150.2%
 
10597.552140.2%
 
7683.231830.2%
 
2649.391770.2%
 
8852.071550.2%
 
11855.631400.2%
 
11960.681360.2%
 
4327.731350.2%
 
12026.61330.2%
 
11631.531320.2%
 
11947.211180.1%
 
13246.941140.1%
 
7007.891090.1%
 
6086.071030.1%
 
9537.81020.1%
 
9037.631000.1%
 
5668.78990.1%
 
7745.56970.1%
 
10696.25960.1%
 
11149.08950.1%
 
Other values (4505)2384627.4%
 
(Missing)5929468.1%
 
ValueCountFrequency (%) 
1176.411< 0.1%
 
1185.565< 0.1%
 
1196.075< 0.1%
 
1202.664< 0.1%
 
1222.552< 0.1%
 
1235.921< 0.1%
 
1256.111< 0.1%
 
1269.671< 0.1%
 
1273.751< 0.1%
 
1317.753< 0.1%
 
ValueCountFrequency (%) 
144748.281< 0.1%
 
135564.482< 0.1%
 
97211.021< 0.1%
 
87489.921< 0.1%
 
79306.651< 0.1%
 
67291.721< 0.1%
 
67140.833< 0.1%
 
66696.844< 0.1%
 
66234.712< 0.1%
 
63917.811< 0.1%
 

Filled_Form
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
N6753077.6%
 
Y1949022.4%
 

Device_Type
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
Web-browser6431673.9%
 
Mobile2270426.1%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length11
Median length11
Mean length9.695472305
Min length6

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories3
Unique unicode scripts2
Unique unicode blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e15133617.9%
 
b15133617.9%
 
r12863215.2%
 
o8702010.3%
 
W643167.6%
 
-643167.6%
 
w643167.6%
 
s643167.6%
 
M227042.7%
 
i227042.7%
 
l227042.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter69236482.1%
 
Uppercase Letter8702010.3%
 
Dash Punctuation643167.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
W6431673.9%
 
M2270426.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e15133621.9%
 
b15133621.9%
 
r12863218.6%
 
o8702012.6%
 
w643169.3%
 
s643169.3%
 
i227043.3%
 
l227043.3%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-64316100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin77938492.4%
 
Common643167.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e15133619.4%
 
b15133619.4%
 
r12863216.5%
 
o8702011.2%
 
W643168.3%
 
w643168.3%
 
s643168.3%
 
M227042.9%
 
i227042.9%
 
l227042.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
-64316100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII843700100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e15133617.9%
 
b15133617.9%
 
r12863215.2%
 
o8702010.3%
 
W643167.6%
 
-643167.6%
 
w643167.6%
 
s643167.6%
 
M227042.7%
 
i227042.7%
 
l227042.7%
 

Var2
Categorical

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
B3728042.8%
 
G3303238.0%
 
C1421016.3%
 
E13151.5%
 
D6340.7%
 
F5440.6%
 
A5< 0.1%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length1
Median length1
Mean length1
Min length1

Overview of Unicode Properties

Unique unicode characters7
Unique unicode categories1
Unique unicode scripts1
Unique unicode blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
B3728042.8%
 
G3303238.0%
 
C1421016.3%
 
E13151.5%
 
D6340.7%
 
F5440.6%
 
A5< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter87020100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B3728042.8%
 
G3303238.0%
 
C1421016.3%
 
E13151.5%
 
D6340.7%
 
F5440.6%
 
A5< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin87020100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
B3728042.8%
 
G3303238.0%
 
C1421016.3%
 
E13151.5%
 
D6340.7%
 
F5440.6%
 
A5< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII87020100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
B3728042.8%
 
G3303238.0%
 
C1421016.3%
 
E13151.5%
 
D6340.7%
 
F5440.6%
 
A5< 0.1%
 

Source
Categorical

Distinct30
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
S1223856744.3%
 
S1332988534.3%
 
S15955996.4%
 
S14343325.0%
 
S12719312.2%
 
S13717242.0%
 
S13413011.5%
 
S1617690.9%
 
S1517200.8%
 
S1576500.7%
 
S1534940.6%
 
S1563080.4%
 
S1442990.3%
 
S1582080.2%
 
S123730.1%
 
S141570.1%
 
S16236< 0.1%
 
S12424< 0.1%
 
S16011< 0.1%
 
S15010< 0.1%
 
S1554< 0.1%
 
S1383< 0.1%
 
S1363< 0.1%
 
S1293< 0.1%
 
S1393< 0.1%
 
Other values (5)6< 0.1%
 
Frequencies of value counts

Unique

Unique4
Unique (%)< 0.1%
Histogram of lengths of the category

Length

Max length4
Median length4
Mean length4
Min length4

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories2
Unique unicode scripts2
Unique unicode blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
18856625.4%
 
S8702025.0%
 
27920222.8%
 
36770619.5%
 
580012.3%
 
463141.8%
 
956051.6%
 
743051.2%
 
611270.3%
 
82110.1%
 
023< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number26106075.0%
 
Uppercase Letter8702025.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S87020100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
18856633.9%
 
27920230.3%
 
36770625.9%
 
580013.1%
 
463142.4%
 
956052.1%
 
743051.6%
 
611270.4%
 
82110.1%
 
023< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common26106075.0%
 
Latin8702025.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
S87020100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
18856633.9%
 
27920230.3%
 
36770625.9%
 
580013.1%
 
463142.4%
 
956052.1%
 
743051.6%
 
611270.4%
 
82110.1%
 
023< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII348080100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
18856625.4%
 
S8702025.0%
 
27920222.8%
 
36770619.5%
 
580012.3%
 
463141.8%
 
956051.6%
 
743051.2%
 
611270.3%
 
82110.1%
 
023< 0.1%
 

Var4
Real number (ℝ≥0)

ZEROS

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.949804643
Minimum0
Maximum7
Zeros2546
Zeros (%)2.9%
Memory size680.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median3
Q35
95-th percentile5
Maximum7
Range7
Interquartile range (IQR)4

Descriptive statistics

Standard deviation1.69771985
Coefficient of variation (CV)0.5755363678
Kurtosis-0.8576087191
Mean2.949804643
Median Absolute Deviation (MAD)2
Skewness0.2211281429
Sum256692
Variance2.882252688
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%) 
32526029.0%
 
12390627.5%
 
52026623.3%
 
465777.6%
 
259316.8%
 
025462.9%
 
723022.6%
 
62320.3%
 
ValueCountFrequency (%) 
025462.9%
 
12390627.5%
 
259316.8%
 
32526029.0%
 
465777.6%
 
52026623.3%
 
62320.3%
 
723022.6%
 
ValueCountFrequency (%) 
723022.6%
 
62320.3%
 
52026623.3%
 
465777.6%
 
32526029.0%
 
259316.8%
 
12390627.5%
 
025462.9%
 

LoggedIn
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
08446697.1%
 
125542.9%
 

Disbursed
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size680.0 KiB
ValueCountFrequency (%) 
08574798.5%
 
112731.5%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
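
As a minimal sketch of the calculation just described, the following pure-Python function divides the covariance by the product of the standard deviations. The function name and the sample values are hypothetical and are not taken from the dataset; the report itself relies on its own library routines.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of products and squares are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: the second list is an exact linear function of the first.
amounts = [100000, 200000, 300000, 500000]
emis = [2500.0, 5000.0, 7500.0, 12500.0]
print(pearson_r(amounts, emis))

# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
print(pearson_r([a / 1000 + 7 for a in amounts], emis))
```

Both calls should give a value of 1 up to floating-point rounding, which illustrates the location and scale invariance mentioned above.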

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
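
The rank-based calculation can be sketched in pure Python as follows. The helper names are hypothetical, and giving tied values the average of their positions is an assumption of this sketch (it is a common convention, but the text above does not specify tie handling).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    var_x = sum((a - mean_rx) ** 2 for a in rx)
    var_y = sum((b - mean_ry) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: a monotonic but clearly nonlinear relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
print(average_ranks([10, 20, 20, 30]))
```

Because only the ordering matters, the cubic relationship in the example is scored as a perfect monotonic correlation.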

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
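
The pair-counting definition above can be sketched directly. This is the plain form described in the text (often called tau-a); tie-adjusted variants such as tau-b give different results when ties are present, and this sketch makes no claim about which variant the report uses. The function name and sample values are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    concordant = 0
    discordant = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward the total only
    return (concordant - discordant) / pairs


# Hypothetical values: six pairs in total, one of them discordant.
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

In the example, five pairs are concordant and one is discordant, so τ is (5 - 1) / 6.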

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
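
For illustration, a bias-corrected Cramér's V in the spirit of Bergsma's correction can be sketched in pure Python as below. The function name, the error handling and the sample labels are assumptions of this sketch, and it is not presented as the report's own implementation.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two values")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)
    if r < 2 or k < 2:
        raise ValueError("each variable needs at least two distinct values")

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        raise ValueError("too few observations for this many categories")
    return math.sqrt(phi2_corr / denom)


# Hypothetical labels: the first pair is perfectly associated, the second independent.
verified = ["Y", "Y", "N", "N"] * 25
device = ["Mobile", "Mobile", "Web-browser", "Web-browser"] * 25
group = ["A", "B", "A", "B"] * 25
print(cramers_v_corrected(verified, device))
print(cramers_v_corrected(verified, group))
```

The first call gives a value of 1 up to rounding (perfect association) and the second gives 0 (independence), matching the range described above.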

Missing values

Sample

First rows

Index | ID | Gender | City | Monthly_Income | DOB | Lead_Creation_Date | Loan_Amount_Applied | Loan_Tenure_Applied | Existing_EMI | Employer_Name | Salary_Account | Mobile_Verified | Var5 | Var1 | Loan_Amount_Submitted | Loan_Tenure_Submitted | Interest_Rate | Processing_Fee | EMI_Loan_Submitted | Filled_Form | Device_Type | Var2 | Source | Var4 | LoggedIn | Disbursed
0 | ID000002C20 | Female | Delhi | 20000 | 23-May-78 | 15-May-15 | 300000.0 | 5.0 | 0.0 | CYBOSOL | HDFC Bank | N | 0 | HBXX | NaN | NaN | NaN | NaN | NaN | N | Web-browser | G | S122 | 1 | 0 | 0
1 | ID000004E40 | Male | Mumbai | 35000 | 07-Oct-85 | 04-May-15 | 200000.0 | 2.0 | 0.0 | TATA CONSULTANCY SERVICES LTD (TCS) | ICICI Bank | Y | 13 | HBXA | 200000.0 | 2.0 | 13.25 | NaN | 6762.90 | N | Web-browser | G | S122 | 3 | 0 | 0
2 | ID000007H20 | Male | Panchkula | 22500 | 10-Oct-81 | 19-May-15 | 600000.0 | 4.0 | 0.0 | ALCHEMIST HOSPITALS LTD | State Bank of India | Y | 0 | HBXX | 450000.0 | 4.0 | NaN | NaN | NaN | N | Web-browser | B | S143 | 1 | 0 | 0
3 | ID000008I30 | Male | Saharsa | 35000 | 30-Nov-87 | 09-May-15 | 1000000.0 | 5.0 | 0.0 | BIHAR GOVERNMENT | State Bank of India | Y | 10 | HBXX | 920000.0 | 5.0 | NaN | NaN | NaN | N | Web-browser | B | S143 | 3 | 0 | 0
4 | ID000009J40 | Male | Bengaluru | 100000 | 17-Feb-84 | 20-May-15 | 500000.0 | 2.0 | 25000.0 | GLOBAL EDGE SOFTWARE | HDFC Bank | Y | 17 | HBXX | 500000.0 | 2.0 | NaN | NaN | NaN | N | Web-browser | B | S134 | 3 | 1 | 0
5 | ID000010K00 | Male | Bengaluru | 45000 | 21-Apr-82 | 20-May-15 | 300000.0 | 5.0 | 15000.0 | COGNIZANT TECHNOLOGY SOLUTIONS INDIA PVT LTD | HSBC | Y | 17 | HAXM | 300000.0 | 5.0 | 13.99 | 1500.0 | 6978.92 | N | Web-browser | B | S143 | 3 | 1 | 0
6 | ID000011L10 | Female | Sindhudurg | 70000 | 23-Oct-87 | 01-May-15 | 6.0 | 5.0 | 0.0 | CARNIVAL CRUISE LINE | Yes Bank | N | 0 | HBXX | NaN | NaN | NaN | NaN | NaN | N | Web-browser | B | S133 | 1 | 0 | 0
7 | ID000012M20 | Male | Bengaluru | 20000 | 25-Jul-75 | 20-May-15 | 200000.0 | 5.0 | 2597.0 | GOLDEN TULIP FLORITECH PVT. LTD | NaN | Y | 3 | HBXX | 200000.0 | 5.0 | NaN | NaN | NaN | N | Web-browser | B | S159 | 3 | 0 | 0
8 | ID000013N30 | Male | Kochi | 75000 | 26-Jan-72 | 02-May-15 | 0.0 | 0.0 | 0.0 | SIIS PVT LTD | State Bank of India | Y | 13 | HAXB | 1300000.0 | 5.0 | 14.85 | 26000.0 | 30824.65 | Y | Mobile | C | S122 | 5 | 0 | 0
9 | ID000014O40 | Female | Mumbai | 30000 | 12-Sep-89 | 03-May-15 | 300000.0 | 3.0 | 0.0 | SOUNDCLOUD.COM | Kotak Bank | Y | 0 | HBXC | 300000.0 | 3.0 | 18.25 | 1500.0 | 10883.38 | N | Web-browser | B | S133 | 1 | 0 | 0

Last rows

Index | ID | Gender | City | Monthly_Income | DOB | Lead_Creation_Date | Loan_Amount_Applied | Loan_Tenure_Applied | Existing_EMI | Employer_Name | Salary_Account | Mobile_Verified | Var5 | Var1 | Loan_Amount_Submitted | Loan_Tenure_Submitted | Interest_Rate | Processing_Fee | EMI_Loan_Submitted | Filled_Form | Device_Type | Var2 | Source | Var4 | LoggedIn | Disbursed
87010 | ID124806G10 | Male | Nagpur | 28000 | 10-Jun-73 | 31-Jul-15 | 0.0 | 0.0 | 0.0 | UTTAM VALUE STEEL LTD,WARDHA | Central Bank of India | Y | 2 | HAXB | 500000.0 | 4.0 | 14.85 | 10000.0 | 13877.39 | Y | Mobile | G | S122 | 5 | 0 | 0
87011 | ID124808I30 | Male | Bengaluru | 15000 | 01-Jun-90 | 31-Jul-15 | 0.0 | 0.0 | 0.0 | AIRTEL | Karnataka Bank | Y | 1 | HBXX | 240000.0 | 4.0 | NaN | NaN | NaN | N | Mobile | G | S122 | 3 | 0 | 0
87012 | ID124810K00 | Male | Bengaluru | 46000 | 02-Jan-85 | 31-Jul-15 | 300000.0 | 3.0 | 0.0 | COGNIZANT TECHNOLOGY SOLUTIONS INDIA PVT LTD | HDFC Bank | Y | 15 | HBXC | 300000.0 | 3.0 | 13.00 | 2400.0 | 10108.19 | N | Web-browser | G | S122 | 4 | 0 | 0
87013 | ID124811L10 | Male | Secunderabad | 24000 | 01-Jan-90 | 31-Jul-15 | 300000.0 | 3.0 | 0.0 | INDIAN AIR FORCE | State Bank of India | Y | 2 | HBXX | 300000.0 | 3.0 | NaN | NaN | NaN | N | Web-browser | G | S122 | 3 | 0 | 0
87014 | ID124812M20 | Female | Pune | 49000 | 31-May-82 | 31-Jul-15 | 400000.0 | 5.0 | 0.0 | INFOSYS TECHNOLOGIES | ICICI Bank | N | 14 | HBXX | NaN | NaN | NaN | NaN | NaN | N | Web-browser | G | S122 | 3 | 0 | 0
87015 | ID124813N30 | Female | Ajmer | 71901 | 27-Nov-69 | 31-Jul-15 | 1000000.0 | 5.0 | 14500.0 | MAYO COLLEGE | ICICI Bank | N | 9 | HBXX | NaN | NaN | NaN | NaN | NaN | N | Web-browser | G | S122 | 3 | 0 | 0
87016 | ID124814O40 | Female | Kochi | 16000 | 01-Dec-90 | 31-Jul-15 | 0.0 | 0.0 | 0.0 | KERALA COMMUNICATORS CABLE LTD | Federal Bank | Y | 1 | HBXB | 240000.0 | 4.0 | 35.50 | 4800.0 | 9425.76 | Y | Mobile | G | S122 | 5 | 0 | 0
87017 | ID124816Q10 | Male | Bengaluru | 118000 | 28-Jan-72 | 31-Jul-15 | 0.0 | 0.0 | 0.0 | BANGALORE INSTITUTE OF TECHNOLOGY | Syndicate Bank | Y | 8 | HBXX | 1200000.0 | 4.0 | NaN | NaN | NaN | N | Mobile | G | S122 | 3 | 0 | 0
87018 | ID124818S30 | Male | Bengaluru | 98930 | 27-Apr-77 | 31-Jul-15 | 800000.0 | 5.0 | 13660.0 | FIRSTSOURCE SOLUTION LTD | ICICI Bank | Y | 18 | HBXX | 800000.0 | 5.0 | NaN | NaN | NaN | N | Web-browser | G | S122 | 3 | 0 | 0
87019 | ID124821V10 | Male | Mumbai | 42300 | 31-Oct-88 | 31-Jul-15 | 0.0 | 0.0 | 0.0 | GOVERNMENT OF INDIA | NaN | Y | 12 | HBXA | 690000.0 | 4.0 | 13.99 | 3450.0 | 18851.81 | N | Web-browser | G | S122 | 4 | 0 | 0